Consistency of Surrogate Risk Minimization Methods for Binary Classification using Strongly Proper Losses

Authors

  • Shivani Agarwal
  • Rohit Vaish
Abstract

We have seen that, under certain conditions on the weights, a weighted-average plug-in classifier (or any learning algorithm that outputs such a classifier for the same training sample) is universally Bayes consistent w.r.t. the 0-1 loss. One might wonder for what other learning algorithms similar statements can be made: can some of the other commonly studied and used learning algorithms be shown to be Bayes consistent w.r.t. the 0-1 loss? We have already seen Bayes consistency results for the ERM algorithm w.r.t. the 0-1 loss, obtained at the expense of computational feasibility. At the other end of the spectrum, we have algorithms such as SVMs and logistic regression that are ubiquitous and computationally feasible but do not directly operate on the 0-1 loss. A natural desideratum in this situation is 'the best of both worlds': can we somehow use the minimization of a 'surrogate' regret by commonly employed learning algorithms as a proxy for minimizing the 0-1 regret?
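To make the surrogate-as-proxy idea concrete, here is a minimal sketch (not from the paper; the data, model class, and step size are illustrative assumptions) that minimizes the logistic surrogate by gradient descent and then thresholds the learned score to read off a 0-1 classifier:

import numpy as np

rng = np.random.default_rng(0)

# Toy data: two Gaussian classes, labels in {-1, +1} (an illustrative assumption).
n = 500
X = np.vstack([rng.normal(-1.0, 1.0, (n, 2)), rng.normal(1.0, 1.0, (n, 2))])
y = np.concatenate([-np.ones(n), np.ones(n)])

def surrogate_risk(w):
    # Empirical logistic (surrogate) risk: mean of log(1 + exp(-y <w, x>)).
    return np.logaddexp(0.0, -y * (X @ w)).mean()

def surrogate_grad(w):
    margins = y * (X @ w)
    coef = -y / (1.0 + np.exp(margins))  # d/dm log(1 + e^{-m}) = -1 / (1 + e^{m})
    return (X * coef[:, None]).mean(axis=0)

# Minimize the surrogate risk, not the (intractable) 0-1 risk.
w = np.zeros(2)
for _ in range(2000):
    w -= 0.5 * surrogate_grad(w)

# Threshold the real-valued score to obtain a binary classifier.
pred = np.sign(X @ w)
print("surrogate risk:", surrogate_risk(w))
print("empirical 0-1 risk:", np.mean(pred != y))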

Related articles

Consistency of Surrogate Risk Minimization Methods for Binary Classification using Classification Calibrated Losses

In the previous lecture, we saw that for a λ-strongly proper composite loss ψ, it is possible to bound the 0-1 regret in terms of the ψ-regret. Hence, for a λ-strongly proper composite loss ψ, if we have a ψ-consistent algorithm, we can use it to obtain a 0-1 consistent algorithm. However, not all loss functions used as surrogates in binary classification are proper, the hinge loss being one...
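Concretely, the bound alluded to above is usually stated in the following form (a sketch: it assumes ψ is λ-strongly proper composite with a link that maps 1/2 to 0, so that thresholding the score f at zero is the right way to read off a classifier):

\[
  \mathrm{regret}^{0\text{-}1}_D[\mathrm{sign} \circ f] \;\le\; \sqrt{\tfrac{2}{\lambda}}\, \sqrt{\mathrm{regret}^{\psi}_D[f]}
\]

Hence any ψ-consistent algorithm, whose ψ-regret converges to zero, yields 0-1 regret converging to zero as well.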

Surrogate Regret Bounds for the Area Under the ROC Curve via Strongly Proper Losses

The area under the ROC curve (AUC) is a widely used performance measure in machine learning and has been widely studied in recent years, particularly in the context of bipartite ranking. A dominant theoretical and algorithmic framework for AUC optimization/bipartite ranking has been to reduce the problem to pairwise classification; in particular, it is well known that the AUC regret can be form...
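As an illustration of the pairwise reduction mentioned above (a sketch, not the paper's construction): the empirical AUC of a scoring function is the fraction of positive-negative pairs it orders correctly, so 1 - AUC is exactly a 0-1 risk over pairs, and surrogate bounds for pairwise classification transfer:

import numpy as np

def empirical_auc(scores, labels):
    # AUC = fraction of (positive, negative) pairs ranked correctly,
    # counting ties as half-correct.
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0) + 0.5 * (diff == 0)).mean()

scores = np.array([0.9, 0.8, 0.3, 0.2])
labels = np.array([1, 0, 1, 0])
print(empirical_auc(scores, labels))  # 0.75: 3 of the 4 pairs are ordered correctly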

Classification Methods with Reject Option Based on Convex Risk Minimization

In this paper, we investigate the problem of binary classification with a reject option, in which one can withhold the decision of classifying an observation at a cost lower than that of misclassification. Since the natural loss function is non-convex, making empirical risk minimization infeasible, the paper proposes minimizing convex risks based on surrogate convex loss functions...
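For intuition about the setting, the following sketch shows the classical Bayes-optimal rule with a reject option (Chow's rule), not the paper's convex surrogate method; here d < 1/2 is the rejection cost and eta(x) = P(Y = 1 | X = x) is the conditional class probability:

def chow_rule(eta, d):
    # Predicting 1 costs 1 - eta, predicting 0 costs eta, rejecting costs d.
    if eta >= 1 - d:
        return 1          # confident positive
    if eta <= d:
        return 0          # confident negative
    return "reject"       # either error is more likely than the reject cost d

for eta in (0.95, 0.5, 0.1):
    print(eta, chow_rule(eta, d=0.2))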

Chapter 11 Surrogate Risk Consistency: the Classification Case

I. The setting: supervised prediction problem. (a) Have data coming in pairs (X, Y) and a loss L : R × Y → R (can have more general losses). (b) Often, it is hard to minimize L (for example, if L is non-convex), so we use a surrogate φ. (c) We would like to compare the risks of functions f : X → R, namely Rφ(f) := E[φ(f(X), Y)] and R(f) := E[L(f(X), Y)]. In particular, when does minimizing the surrogate g...
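A minimal numerical illustration of the two risks being compared, Rφ(f) and R(f) (the hinge surrogate, data, and scorer below are assumptions for the sketch, not taken from the chapter):

import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 3))
y = np.sign(X @ np.array([1.0, -2.0, 0.5]) + 0.3 * rng.normal(size=1000))

def risks(f, X, y):
    # Empirical versions of R_phi(f) = E[phi(f(X), Y)] (hinge surrogate here)
    # and R(f) = E[L(f(X), Y)] with L the 0-1 loss.
    margins = y * f(X)
    r_phi = np.maximum(0.0, 1.0 - margins).mean()   # hinge surrogate risk
    r_01 = (margins <= 0).mean()                    # 0-1 risk
    return r_phi, r_01

f = lambda X: X @ np.array([1.0, -2.0, 0.5])
print(risks(f, X, y))  # the hinge risk upper-bounds the 0-1 risk pointwise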

Consistency of structured output learning with missing labels

In this paper we study statistical consistency of partial losses suitable for learning structured output predictors from examples containing missing labels. We provide sufficient conditions on the data-generating distribution under which the expected risk of the structured predictor learned by minimizing the partial loss converges to the optimal Bayes risk defined by an associated com...

Publication date: 2013